ABSTRACT


A concept for an apparatus which visually displays and responds
to the first and second formants of vowel sounds is developed. The
machine is intended for use by deaf and speech handicapped children
in learning to produce voiced sounds. System design and principles
applied to realize a physical prototype of this concept are presented.
The complete electronic and mechanical design plus fabrication of the
automatic electronic speech training responder is described in
detail. Schematic diagrams of all electronic circuitry employed
and photographs of the prototype equipment are included. The
apparatus is on loan to the Monterey Institute for Speech and
Hearing, Monterey, California, for clinical testing and evaluation.
1. INTRODUCTION

Man is born with the natural instinct and physical capacity 
to eat and breathe, but he must learn how to speak. This learning 
process depends on good hearing ability during the formative years. 
A child, during initial attempts to speak, constantly monitors his 
utterances with his ears. These sensors provide the necessary 
information to the brain to modify the vocal tract modulators and 
articulators with respect to the points of articulation until the 
desired sound, phoneme or word is correctly produced. If this
feedback loop (voice output-ear sensor-brain input) is defective
or nonexistent in a human, another physical sensor must be used
as an alternate feedback path to monitor generated speech sounds
on a real time basis if intelligent and comprehensible
communication is to be achieved. Many devices have been devised
and constructed which transform speech sounds into a visual
display or a tactile signal.

This thesis is directed toward the attempt to process 
specific speech sounds and to display or provide a positive response 
when the desired sound has been correctly produced. In addition, 
the machine must be simple in final output so that it can be easily 
used and interpreted by children. 

Computer sciences have stimulated research into speech recognition
and synthesis. Unfortunately, this type of engineering
technology is too costly and sophisticated at the present time for
application to elementary speech training problems. Rather specific
guidelines on the needs of training devices for children were
developed by Dr. Burl Gray of the Monterey Institute for Speech
and Hearing; these are:
1. A definite need exists for simple, inexpensive devices
which will assist or supplement the speech therapist's 
work with deaf children. These devices would permit the 
instructor to teach more students simultaneously or the 
devices could perform elementary tasks of providing 
various mechanical responses to repetitious articulation 
drills without the constant attention or intervention of the 
speech therapist. 

2. The information display or mechanical response of such an
apparatus must be in a form which is easily communicable
to and understood by the child. Careful attention must
be given to the human-machine interface problem to ensure
good results with a given age group and mental attitude.

3. The apparatus must present the visual or mechanical 
response while the child is speaking (i.e. real time). 

Using these criteria, an attempt has been made to design and
construct an apparatus which will respond only to a defined
pronunciation of the basic American vowel sounds. The vowel sounds
were selected for machine recognition because they require the
minimum amount of audio spectral information to be uniquely
identified. However, the approach to this vowel processing
technique is sufficiently general that it may be extended
to process other sounds.

Figure 1 is a graphic representation of a generalized
man-machine speech feedback system.
2. THE VOWEL SOUND

A human can produce a multitude of speech sounds by controlling
his articulators (the tongue and lower lip), the points of
articulation (the upper lip, alveolar ridge, the hard palate, the soft
palate or velum and lower teeth), and the excitation of his vocal
bands. The vocal bands, if tensed and therefore vibrating, modulate
the air stream exhaled from the lungs to establish a category of
sounds which are classed as voiced sounds. All vowels are voiced
sounds which are excluded from entry into the nasal cavity by a
raised velum and therefore emanate solely from the oral cavity.

It will become apparent that the vowels are constrained to a
small category of speech sounds by definition of the manner in
which they are articulated. In fact, the basic American vowels
consist of 10 phonemes. Table 1 lists the individual sounds with
their phonetic notation and representative words. [10,30,33]

Since this thesis is devoted to application of electronic 
techniques to speech processing, it is natural to begin with a 
machine which will react to the most fundamental sounds which 
require the minimal spectral information to be recognized or 
identified. The vowel can be specified by a minimum of two spectral 
parameters in most sound situations. Joint discussions with Dr. 
Gray and Dr. Ewing resulted in establishing a mutually acceptable 
concept of an electronic vowel teaching machine. This local merger 
of ideas from two disciplines proves once again that scientific 
boundaries can greatly overlap and the systems engineering approach 
to problems may be of great benefit to all concerned. 

The theory of vowel production can be described in terms of
steady state (or harmonic) conditions with application of
cord-tone-resonance effects. [21] Modern analytical representation
of the same effect can be stated in terms of excitation functions
and convolution techniques. [36] The former description states in
effect that the vocal bands (a modern synonym for cords [3]),
during phonation, set up in the air immediately adjacent to them
a complex motion which consists of a fundamental component, known
as pitch, and a large number of its overtones or harmonics. This
complex air motion constitutes the so-called band-tone. The
theory further states that the vocal cavities, on which the
band-tone acts as a force, have the properties of simple resonators
and thus serve to modify the spectrum of energy flowing from the bands.
In terms of this theory, a vowel sound, as emitted from the mouth,
is due to both selective generation and selective transmission plus
radiation. This sound is composed mainly of harmonic components
of the fundamental, each of which has a determinable magnitude.
For example, the greatest magnitudes of the harmonic components
usually are found to exist for the 6th through 9th component and
13th through 16th component for the particular vowel sound /a/. [21]
Naturally, for other vowels, the oral cavities change in physical
dimensions thus affecting the resonant properties of these chambers
and hence causing other harmonic components or partials of the
fundamental vibration of the vocal bands to be amplified or
attenuated.

TABLE 1

VOWEL PHONETIC SYMBOLS AND REPRESENTATIVE WORDS

(Column heading: Typewritten Symbol for Vowel)

Figure 2. Typical Spectrograms of the vowels by a
male voice. (Vertical axis: Frequency (Hz))

The spectrograph has greatly enhanced the study of speech
sounds and in particular vividly identifies the amplified partial
tones or resonant frequencies uniquely identifiable with each vowel
sound. [32,33] Figure 2 provides a sketch representing the
spectrographic tracings due to each vowel sound. The dark areas
represent the amplified harmonics of the fundamental pitch of the
voice. Note that these locations are unique for each vowel,
especially for the first and second resonant frequencies. In the
terminology of visible speech the dark bands are called "formant"
regions or "bars" and for reference purposes are designated by
number, the lowest on the frequency scale being bar 1 or first
formant F1, the next bar 2 or second formant F2, etc. In this
thesis, the notation F1 and F2 shall be used to designate the
first and second resonant frequencies of vowel sounds respectively.

The first and second formants are the only two pieces of
spectral information required, in most cases, to identify a particular
vowel. The third formant (F3) is helpful in distinguishing between
overlapping first and second formant frequencies. Potter and
Peterson have suggested that the human ear recognizes vowel sounds,
not by the spectral location of F1 and F2, but rather by the relative
frequency separation or difference between F1 and F2. [33] Table 2
lists the F1, F2 and F3 frequencies for the vowels of Table 1 while
Table 3 lists the relative formant amplitudes. [30] Figure 3 shows
a two dimensional plot of F1 vs F2. [οἱ This figure is the crux
of the apparatus designed to recognize vowel sounds. Note that
in the F1-F2 plane each vowel has a specific location; it is also
interesting to note that the locations of these sounds correspond
roughly to the position of the tongue in the oral cavity if you
imagine looking at a side view of the head.

The vowel training device does not work on the relative location
of F1 to F2 but rather utilizes an electronic spectral window in the
F1-F2 plane to target a particular vowel sound or, for that matter,
any voiced combination of two oral resonances in this dual formant
plane.
TABLE 2

AVERAGES OF FUNDAMENTAL AND FORMANT FREQUENCIES

Symbol  Speaker  Fundamental     First         Second        Third
                 Frequency (Hz)  Formant (Hz)  Formant (Hz)  Formant (Hz)

i       M        136             270           2290          3010
        W        235             310           2770          3310
        Ch       272             370           3200          3730

ɪ       M        135             390           1990          2550
        W        232             430           2480          3070
        Ch       269             530           2730          3600

ɛ       M        130             530           1840          2480
        W        223             610           2330          2990
        Ch       260             690           2610          3570

æ       M        127             660           1720          2410
        W        210             860           2050          2850
        Ch       251             1010          2320          3320

ɑ       M        124             730           1090          2440
        W        212             850           1220          2810
        Ch       256             1030          1370          3170

ɔ       M        129             570           840           2410
        W        216             590           920           2710
        Ch       263             680           1060          3180

ʊ       M        137             440           1020          2240
        W        232             470           1160          2680
        Ch       276             560           1410          3310

u       M        141             300           870           2240
        W        231             370           950           2670
        Ch       274             430           1170          3260

ʌ       M        130             640           1190          2390
        W        221             760           1400          2780
        Ch       261             850           1590          3360

ɝ       M        133             490           1350          1690
        W        218             500           1640          1960
        Ch       261             560           1820          2160


TABLE 3

FORMANT AMPLITUDES MEASURED RELATIVE TO /3/

IPA Symbol    First Formant (db)    Second Formant (db)    Third Formant (db)
Figure 3. Central Regions of First and Second Formant
Frequencies of the Common American Vowels. (Horizontal axis:
F1 (Hz); vertical axis: F2 (Hz))
3. AESTR AS A TEACHING AID

During the process of physically realizing a prototype of the
vowel training machine, liberty was taken by the author and his
associates in the electronics laboratory to coin a name for this
device. The result is the Automatic Electronic Speech Teaching
Responder (AESTR). The title should convey the notion that this
machine is not intended to replace a speech therapist but rather
to assist him in his work. AESTR will be initially preset through the
oscillator frequency dials and the control knobs located on the
front panel of the device. Now the child is placed in a room with
a candy dispenser or some other motivational responder and a
microphone. He is asked to make any sound he cares to. As the
child produces various sounds he should produce the desired sound
in due time. The machine will only activate the candy dispenser when
the child has produced the targeted voiced sound, and the child
will keep trying to repeat the sound in order to maximize his
reward. As the rewards are given more frequently, the teacher is
able to adjust the filter bandwidths on AESTR and narrow the
spectral window of the desired sound, hence increasing the
articulation accuracy required of the child if he is to obtain
his reward.

The child learns to speak desired sounds by communicating
directly with AESTR. However, positive control of the speech
training process is available to the teacher by his ability
to vary six parameters from the front panel of AESTR (F1
bandwidth and sensitivity, F2 bandwidth and sensitivity, pitch
filter, microphone gain).




4. SYSTEM CRITERIA AND DESIGN

The system incorporates two basic electronic functions to
locate and measure, in real time, the first and second formants of
a voiced sound. The speech sound is first mixed with two local
oscillators by means of non-linear devices. One of the components
obtained from this process, the difference frequencies between
the formants and oscillators, is isolated by active low pass
filter circuits. The oscillators and low pass filters are
variable and can be set for a particular sound or spectral
window. By setting the two local oscillators to the known
frequencies for F1 and F2 of a particular vowel and the low pass
cutoff frequency for the desired degree of accuracy of response,
the machine is able to process the speech sound and provide a
binary decision response.

The responses are:

1. A positive response which is movement of two voltmeter
indicators and a light being activated if both meters
are at maximum value simultaneously. This condition occurs
when the resonant frequencies of the voice correlate
with the preset local oscillator frequencies simultaneously.
The correct voiced sound is being produced
by the student. The apparatus also has an external
motivation output jack which can operate other reward
machines when the targeted sound is produced by the
student.

2. No response. One or both formant frequencies are not
present or they do not correlate within limits set by
the filter pass band.
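
The binary decision described above can be illustrated with a minimal
sketch. It is an idealized model and not the thesis circuit: each channel
is assumed to respond whenever the difference between the voice formant
and its local oscillator is no greater than the low-pass cutoff, and the
function names `channel_responds` and `positive_response` are
hypothetical. The oscillator settings are the adult male averages from
the first row of Table 2.

```python
def channel_responds(formant_hz: float, oscillator_hz: float, cutoff_hz: float) -> bool:
    """Idealized channel: the difference frequency must fall inside the low-pass band."""
    return abs(formant_hz - oscillator_hz) <= cutoff_hz


def positive_response(f1_hz: float, f2_hz: float,
                      osc1_hz: float, osc2_hz: float,
                      cutoff1_hz: float = 60.0, cutoff2_hz: float = 60.0) -> bool:
    """Binary decision: both formant channels must respond simultaneously."""
    return (channel_responds(f1_hz, osc1_hz, cutoff1_hz)
            and channel_responds(f2_hz, osc2_hz, cutoff2_hz))


# Oscillators preset to the male averages in the first row of Table 2.
OSC1_HZ, OSC2_HZ = 270.0, 2290.0

# A slightly off-target attempt passes the wide 60 Hz window ...
print(positive_response(280.0, 2250.0, OSC1_HZ, OSC2_HZ))
# ... but fails once the teacher narrows both windows to 30 Hz.
print(positive_response(280.0, 2250.0, OSC1_HZ, OSC2_HZ, 30.0, 30.0))
# A different vowel (second row of Table 2) gives no response.
print(positive_response(390.0, 1990.0, OSC1_HZ, OSC2_HZ))
```

The real low-pass filters roll off gradually rather than cutting off
sharply, so the voltmeters would show a partial deflection near the edge
of the window instead of the all-or-nothing behavior modeled here.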
Many methods were considered for realization of this device in
terms of simplicity, cost and expediency. Primary concern was to
produce some type of primitive machine which would do the basic tasks
required by this particular vowel teaching aid. The approach finally
selected for the first attempt is to process the complex speech
waveform in analog form in the audio spectrum. Advantage was taken of
the Field-Effect Transistor (FET) which has an almost perfect square
law response and which is ideally suited for optimum mixing of
oscillator and voice frequencies. The filtering is accomplished by means
of active low-pass filters using the readily available integrated
circuit operational amplifiers.

An additional factor must be considered in AESTR's system design.
The pitch of a human voice can range from approximately 75 to 500 Hz.
[32] The formant frequencies range from approximately 250 to 3000 Hz.
It is necessary to eliminate the pitch frequency from the audio speech
prior to the mixing operation; otherwise it is possible for the pitch
or fundamental frequency to pass directly through the mixers and
filters, thus producing a positive machine response regardless of the
formant and oscillator frequencies present. Figure 4 represents the
basic system approach for realization of this apparatus.
Blocks shown in Figure 4, from input to output:

Input chain: Microphone; Preamplifier; Pitch Attenuator
(variable high pass filter)

First formant channel: First Formant Local Oscillator; Low pass
filter (variable 10, 15, 30 and 60 Hz); Visual Indicator
(0-10 V Voltmeter)

Second formant channel: Second Formant Local Oscillator; Low pass
filter (variable 10, 15, 30 and 60 Hz); Visual Indicator
(0-10 V Voltmeter)

Output: Binary Decision (are both outputs at maximum?); if YES,
"Correct" (green light) activated and External Motivation activated

Figure 4. AESTR Electronic Transducer, Detector, and Display System


5. CONTROL PANEL DESIGN AND OPERATION

AESTR is to be operated by individuals who do not possess
an engineering background. Therefore the panel is designed
to be self explanatory and requires minimum instruction for
operation. The controls are fairly large to permit positive
grasp by the operator. Also, the functional location of the knobs
and visual indicators is evident from the partition lines. The
objective is to have the panel functions reflect the needs of
the operator rather than the requirements of the internal
circuitry. AESTR's control panel is shown in Figure 5.

The "volume" control is self explanatory and permits the
operator to vary the gain of the preamplifier circuit.

The "pitch" control permits selection of four cutoff
frequencies of the high-pass filter circuit in order to suppress
the fundamental frequency of a voice while passing all the
formant frequencies. In Table 4 below, the letter positions
are identified with the 3 db cutoff frequencies of the
high-pass filter.

TABLE 4
HIGH-PASS FILTER CUTOFF FREQUENCIES

Pitch control position     Frequency (Hz)

A (male voice)             ?5
B (female voice)           190
C (child's voice)          450
D (special use)
The pitch control setting is not critical for the back vowel
sounds such as OW in the word "father" and can remain in the
"A" or "B" setting for all speakers regardless of sex or age.
Position "D" is used when working with the central vowel such
as ER in the word "bird". It is necessary to suppress the
frequencies below 1 KHz and operate with the second and/or third
formants in order to have the machine respond only to this
particular vowel. This technique was developed during the
testing of AESTR and is discussed further in section 13.

The "F1 LPF cutoff" and "F2 LPF cutoff" controls select the cutoff
frequencies of the variable low-pass filters. The controls are
located above the first and second formant voltmeter indicators
respectively. Cutoff frequencies of 10, 15, 30 and 60 Hz are printed
around the periphery of the control knobs. Normally the controls are
initially set in the 60 Hz position when searching for a voice
formant. This setting provides the widest possible filter
pass band, such that the voltmeter needle will begin to deflect
up scale whenever the oscillator and voice formant are within
± 60 Hz of each other. As the two frequencies become more nearly
coincident, the voltmeter needle will show a maximum scale
deflection. When the operator has the oscillator set at a
frequency which gives the maximum needle deflection, he may
elect to switch the "LPF cutoff" control to 30 Hz in order to
narrow the filter response pass band. It may be necessary
to readjust the local oscillator slightly for maximum scale
deflection. This procedure can be continued for the 15 and
10 Hz cutoff frequencies respectively.

The "F1 Filter Sensitivity" and "F2 Filter Sensitivity"
controls vary the gain of the filters. The word "sensitivity"
is chosen for contrast against the "volume" control nomenclature
and is intended to prevent any misunderstanding between the two
types of controls. The "filter sensitivity" controls are
adjusted to make the voltmeter needle deflect to full scale when
the local oscillator and formant frequencies most nearly
coincide. Each vowel sound will have its unique "filter
sensitivity" setting due to the varying intensity levels of the
formants of the individual phonemes. The operator must determine
these settings empirically since the sensitivity is also a
function of the intensity of the speaker's voice. It is
advisable to keep the "volume" control knob at a minimum setting
and the "sensitivity" control knobs at a high setting to reduce
the effects of acoustic and electrical noise.


The "correct" green light illuminates when both formant
indicators read an up scale deflection of 7 volts. Light
activation is delayed 250 milliseconds and, once lighted, stays
on for a period of 2 seconds. The delay prevents the light
from being activated by transient full scale deflections which
occur from plosive type consonant sounds preceding a vowel
in such a word as "bar". The light hold time of 2 seconds
prevents the light from flickering if the voice begins to
quiver during articulation of a phoneme.
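
The timing behavior of the "correct" light can be sketched as a small
state machine. This is an illustrative model only: the 7 volt threshold,
250 millisecond delay and 2 second hold come from the text, while the
sampled-data form, the 50 millisecond step, the function name
`correct_light`, and the choice to restart the hold period while the
sound is sustained are assumptions.

```python
def correct_light(samples, dt=0.05, threshold_v=7.0, delay_s=0.25, hold_s=2.0):
    """Return the light state (True = lit) for each (f1_volts, f2_volts) sample."""
    delay_steps = round(delay_s / dt)
    hold_steps = round(hold_s / dt)
    run = 0    # consecutive samples with both meters at or above threshold
    hold = 0   # samples of on-time remaining
    states = []
    for v1, v2 in samples:
        run = run + 1 if (v1 >= threshold_v and v2 >= threshold_v) else 0
        if run >= delay_steps:
            hold = hold_steps  # (re)start the hold period
        states.append(hold > 0)
        if hold > 0:
            hold -= 1
    return states


# A 0.1 s plosive transient never lights the lamp.
plosive = [(8.0, 8.0)] * 2 + [(0.0, 0.0)] * 10
print(any(correct_light(plosive)))

# A vowel sustained for 0.5 s lights the lamp after the delay and holds it.
vowel = [(8.0, 8.0)] * 10 + [(0.0, 0.0)] * 60
lit = correct_light(vowel)
print(lit.index(True), sum(lit))
```

In this model the lamp first lights on the fifth consecutive
above-threshold sample, which corresponds to the 250 millisecond delay at
the assumed 50 millisecond step.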

In the rear of the AESTR cabinet is located an ordinary
female 115 volt receptacle. Any external motivational device,
such as an M&M candy dispenser, can be attached to this
terminal and will be operated automatically since the terminal
provides 115 volts only during the interval when the "correct"
light is illuminated.


AESTR also has the capability of measuring the pitch of
a person's voice. Turn the "pitch" control clockwise beyond
[illegible]. Set the "F1 LPF cutoff" control
to 10 Hz and the "F1 Sensitivity" control to a maximum value of
10. Turn the "F2 Sensitivity" full counterclockwise to a value
of 0. Sweep the F1 local oscillator through a range of 60 to
500 Hz. The speaker's pitch will be read on the F1 oscillator
frequency setting when the first formant visual indicator has
a maximum up-scale deflection.
6. PREAMPLIFIER DESIGN

The mixer circuit is able to accept a maximum input signal
of 0.8 volts peak to peak. A preamplifier is necessary,
especially if a dynamic microphone is being used, to amplify the
voice sound for maximum mixer output. The Fairchild uA709
operational amplifier was selected to perform this function
utilizing the standard feedback configuration and necessary
frequency compensation. It is shown schematically in Figure 6.

The uA709 comes in an epoxy TO-5 configuration. The
detailed circuitry employed in this integrated circuit and
performance data are readily available from the manufacturer. [11]
The price of this device is not considered to be excessive
at the present time.
Figure 6. Preamplifier Schematic

Note: Letter-number combination inside square indicates
circuit board by the letter and the terminal of the board
by the number.
7. MIXER CIRCUIT
The transfer characteristic of a field-effect transistor
(FET), made by the diffusion process, has a square-law
relationship between the drain to source current, $I_{DS}$, and the
gate to source voltage, $v_{GS}$. It is expressed as [25]

$$I_{DS} = I_{DSS}\left(1 - \frac{v_{GS}}{V_P}\right)^2 \tag{1}$$

where $I_{DSS}$ is the saturation drain current when the gate is shorted
to the source ($v_{GS} = 0$) and $V_P$ is the pinch off voltage. For
mixer operation let $v_{GS}$ be represented as the sum of two
sinusoidal voltages, both of which can be simultaneously impressed
on the gate of an FET, or one impressed on the gate and the
other on the source of the FET. Either process will cause
mixing operation and $v_{GS}$ as defined below holds true for both
cases

$$v_{GS} = V_{GS} + V_s\cos\omega_s t + V_o\cos\omega_o t \tag{2}$$

where $V_{GS}$ is the bias gate to source voltage. $V_s\cos\omega_s t$
represents the source or voice sound while $V_o\cos\omega_o t$ is the
sinusoid generated by the local oscillator which is applied
to the gate or source of the FET. Substituting (2) into (1)
and expanding, we obtain

$$
\begin{aligned}
I_{DS} = \frac{I_{DSS}}{V_P^2}\Big[\, & (V_P - V_{GS})^2 + \tfrac{1}{2}V_s^2 + \tfrac{1}{2}V_o^2 \\
& - 2(V_P - V_{GS})(V_s\cos\omega_s t + V_o\cos\omega_o t) \\
& + \tfrac{1}{2}V_s^2\cos 2\omega_s t + \tfrac{1}{2}V_o^2\cos 2\omega_o t \\
& + V_s V_o\{\cos(\omega_s + \omega_o)t + \cos(\omega_s - \omega_o)t\}\Big]
\end{aligned} \tag{3}
$$

The drain current has DC components plus six individual
frequencies as a result of the square law mixing of an FET. This
response shows that only frequencies of the form $\omega_s$, $\omega_o$,
$2\omega_s$, $2\omega_o$, $\omega_s + \omega_o$, and $\omega_s - \omega_o$ are
obtained, while other frequencies of the form $m\omega_s \pm n\omega_o$,
which must be suppressed in conventional mixers, are greatly
reduced with an FET mixer circuit. [2 |

The frequency component of the drain current which is of
interest is $V_s V_o\cos(\omega_s - \omega_o)t$. It is separated from the
other components by coupling to the mixer output a DC blocking
capacitor followed by a low-pass filter which has a cut off
frequency $\omega_f$ such that $\omega_s - \omega_o < \omega_f <$ both
$\omega_s$ and $\omega_o$.
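
The frequency content predicted by the square-law expansion can be
checked numerically. The sketch below is illustrative only: the pinch off
voltage and saturation current are the 2N3819 data sheet values quoted in
this section, while the bias voltage, signal amplitudes, test frequencies
and the helper names `drain_current` and `amplitude` are assumptions.
Each component is measured by projecting one second of the simulated
drain current onto a cosine of the frequency of interest.

```python
import math

I_DSS = 10e-3          # saturation drain current, amperes (data sheet value)
V_P = -8.0             # pinch off voltage, volts (data sheet value)
V_BIAS = -5.5          # assumed bias gate to source voltage, volts
V_S, F_S = 0.3, 300    # assumed voice component: amplitude (V), frequency (Hz)
V_O, F_O = 0.4, 290    # assumed oscillator: amplitude (V), frequency (Hz)
RATE = 8000            # samples per second; one second of signal is analyzed


def drain_current(t):
    """Square-law drain current for the two-tone gate to source voltage."""
    v_gs = (V_BIAS + V_S * math.cos(2 * math.pi * F_S * t)
            + V_O * math.cos(2 * math.pi * F_O * t))
    return I_DSS * (1.0 - v_gs / V_P) ** 2


SAMPLES = [drain_current(n / RATE) for n in range(RATE)]


def amplitude(freq_hz):
    """Cosine amplitude of the drain current at an integer frequency in Hz."""
    total = sum(i * math.cos(2 * math.pi * freq_hz * n / RATE)
                for n, i in enumerate(SAMPLES))
    return 2.0 * total / RATE


# Amplitude of the difference term predicted by the expansion.
expected_difference = I_DSS / V_P ** 2 * V_S * V_O

# The six predicted frequencies, plus 2*F_S - F_O, which should be absent.
for f in (F_S - F_O, F_O, F_S, 2 * F_O, F_S + F_O, 2 * F_S, 2 * F_S - F_O):
    print(f"{f:4d} Hz: {amplitude(f) * 1e6:9.3f} microamperes")
print(f"predicted difference term: {expected_difference * 1e6:.3f} microamperes")
```

With a 10 Hz difference frequency, only the first line of the printout
would survive a low-pass filter set to any of the AESTR cutoff
positions.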

Initially, a dual gate metal oxide semiconductor (MOS)
FET was selected as being particularly well suited for use in
mixing two audio frequencies. Eight 3N141 MOS-FET's were ordered
but due to excessive delay in receipt of these devices, it was
necessary to design and build a mixer using a single gate
FET already in stock in the school electronics issue room.
This device requires that the voice signal be impressed on the
gate while the local oscillator signal is applied to the source
terminal. Several types of FET's available from the issue room
were tested for mixing action in the circuit shown in Figure 7a.
The 2N3819 proved to be the most satisfactory device. Its
transconductance as a function of gate to source voltage is
quite linear over the range from zero $V_{GS}$ to pinch off voltage
$V_P$. This characteristic enhances the mixing action of an FET. [20]

The local oscillator used in AESTR is a URM-127 signal 
generator. It has an output impedance of approximately 100 
ohms and can deliver a signal ranging from the microvolt range 
to a maximum of 10 volts. 

In designing the mixer circuit the author relied on the
manufacturer's data sheet for the 2N3819 FET. It is an N-channel
device with $V_P$ = -8 volts, $I_{DSS}$ = 10 ma and an average
transconductance of 4000 micromhos for zero gate bias. The DC
drain current $I_D$ was selected to be 1 ma and the mixer circuit
was designed to give a voltage gain of 10. These conditions
were incorporated into the circuit design [25] and values were
obtained such that $R_S$ = 5.5 kohms and $R_D$ = 8.1 kohms. The
network in Figure 7a was constructed and the components
subsequently modified to the circuit of Figure 7b to obtain
optimum mixing.

Figure 7. AESTR Mixer Schematic
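
The resistor values quoted above can be approximated from the data sheet
figures with simplified textbook relations. This is a hedged sketch and
not the design procedure of reference [25]: it assumes a self-biased
stage with the gate at DC ground, a transconductance that falls linearly
with gate bias, and a voltage gain approximated by the product of
transconductance and drain resistance.

```python
import math

V_P = -8.0        # pinch off voltage, volts
I_DSS = 10e-3     # saturation drain current, amperes
GM0 = 4000e-6     # transconductance at zero gate bias, mhos
I_D = 1e-3        # selected DC drain current, amperes
GAIN = 10.0       # target voltage gain

# Invert the square law I_D = I_DSS * (1 - V_GS / V_P)**2 for the bias voltage.
v_gs = V_P * (1.0 - math.sqrt(I_D / I_DSS))

# Self bias (assumed): gate at DC ground, so V_GS = -I_D * R_S.
r_s = -v_gs / I_D

# Transconductance at the operating point (assumed linear in V_GS).
g_m = GM0 * (1.0 - v_gs / V_P)

# Approximate gain magnitude g_m * R_D (assumed).
r_d = GAIN / g_m

print(f"V_GS = {v_gs:.2f} V, R_S = {r_s / 1e3:.2f} kohms")
print(f"g_m = {g_m * 1e6:.0f} micromhos, R_D = {r_d / 1e3:.2f} kohms")
```

Under these assumptions the source resistor comes out close to the
5.5 kohm value in the text and the drain resistor within a few percent of
8.1 kohms; the small difference is expected because the thesis does not
state which approximations it used.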

Successful mixing of any two audio frequencies is accomplished
by means of this circuit with no lower limit on the
input and local oscillator voltages. An upper limit of 1.5
volts peak to peak for the signal and local oscillator voltages
cannot be exceeded; otherwise the output is clipped. Optimum
operation of this mixer circuit is set for an input of
approximately 0.8 volts peak to peak. Above this voltage,
the follow-on filter circuits begin to give spurious outputs
due to sweeping of either the voice oscillator or local
oscillator across the frequency spectrum. This effect is noticeable
on the F1 and F2 voltmeter indicators and masks the
frequency response of the filters.

The 2N3819 FET's have consistently performed the mixing 
operation on a daily basis during the entire period covering 
the design and testing of the formant indicators. These par- 
ticular FET's are highly recommended both for their reliability 
and usefulness in audio mixing circuits. 

As an epilog to the mixer design realization, the 3N141
MOS-FET's did arrive finally. Other students have had limited
success in using these devices for mixing. Special care must
be exercised in using them, especially with regard to preventing
any external high voltages (static charges, etc.) from
accidentally damaging the devices.




8. FILTER CONSIDERATIONS

Many types of network designs will yield either low-pass,
high-pass or band-pass frequency filters. The networks may be
synthesized using only passive elements [18, 19] or, in addition
to resistive or capacitive components, incorporate a radio tube,
[35] transistor, [15] or an integrated circuit operational
amplifier. [4,5,14] When a filter design calls for cutoff frequencies
below 100 Hz, several considerations tend to indicate that an
active RC filter circuit is the most desirable type. Table 5
lists the relative characteristics of passive and active filters
with cutoff frequencies below 100 Hz.

Active network synthesis can be classified in a number of
ways, depending on the purpose of active elements and the network
configuration. The three main types of active synthesis consist
of: a. Classical Amplifier Design, where the active element is
part of the parameters of the network; b. Feedback Systems, where
feedback theories are used to synthesize poles and zeros of a
network function. In this case active elements are used as
isolation or amplification devices, or as functions of
operational amplifiers; c. Modification of Passive Synthesis,
where techniques of passive synthesis are used to realize
portions of a network that are connected together by active
elements. In all three categories listed, the active elements
are used mainly as controlled-source devices which perform
functions of subtraction, negative-constant multiplier or
inversion. They can be treated as black boxes performing their
prescribed mathematical functions. [38]


TABLE 5

COMPARISON OF PASSIVE AND ACTIVE FILTER CHARACTERISTICS
FOR LOW FREQUENCY APPLICATIONS

(Column headings: Passive RLC or RC Filters; Active RC Filters)

The ideal low-pass filter with unity transmission below
and zero above a certain frequency, with no phase shift in the
pass band, is unattainable in the real world. Three
approximations to the ideal filter can be realized by means of the
Butterworth, Bessel or Chebyshev filters. [14]

The Bessel filter exhibits maximally flat time delay
(linear phase) and therefore sometimes is used as a time delay
network. Its amplitude response in the pass band is monotonically
decreasing rather than flat. Its rate of fall beyond cutoff is
less than that of the Butterworth or Chebyshev filters.

The Chebyshev class of filters has an equal magnitude
ripple in the pass band and maximum rate of fall beyond 3 db
cutoff. The response of the filter at the cutoff frequency is
always that of a minimum of the ripple. The allowable degree
of ripple in the pass band can be accounted for in the filter
design.

The Butterworth filter is obtained by locating the poles
of the network in accordance with the zeros of the Butterworth
Polynomial. The normalized transfer function is of the form

$$|Z_o(j\omega)|^2 = \frac{1}{1 + \omega^{2n}}$$

where n is the number of poles in the network and $\omega$ is the ratio
of frequency of interest to cutoff frequency. The filter has
a maximally flat amplitude response in the pass band and the
slope of rolloff outside the pass band increases directly with
the number of poles in the transfer function. The response falls
off at approximately a constant 6n db/octave. The phase
characteristics of the Butterworth filter are not very linear.
The time delay varies as a function of frequency.
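
The Butterworth magnitude expression and the 6n db/octave rule can be
checked with a short calculation. The sketch below evaluates the
normalized response given above; the function name
`butterworth_gain_db` and the 100 Hz example frequency are illustrative
choices, not taken from the thesis.

```python
import math


def butterworth_gain_db(freq_hz, cutoff_hz, poles):
    """Gain in db of a normalized Butterworth low-pass response."""
    w = freq_hz / cutoff_hz
    return -10.0 * math.log10(1.0 + w ** (2 * poles))


POLES = 4

# Gain at the cutoff frequency is about -3 db for any number of poles.
print(round(butterworth_gain_db(60.0, 60.0, POLES), 2))

# Well above cutoff, one octave costs roughly 6n = 24 db for four poles.
octave_slope = (butterworth_gain_db(16.0, 1.0, POLES)
                - butterworth_gain_db(8.0, 1.0, POLES))
print(round(octave_slope, 2))

# Attenuation of a 100 Hz difference frequency at each cutoff setting.
for cutoff in (10, 15, 30, 60):
    print(cutoff, "Hz cutoff:", round(butterworth_gain_db(100.0, cutoff, POLES), 1), "db")
```

The last loop illustrates why narrowing the cutoff tightens the spectral
window: the same 100 Hz mismatch is attenuated far more heavily at the
10 Hz setting than at the 60 Hz setting.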
9. LOW-PASS FILTER DESIGN

The F1 low-pass filter and the F2 low-pass filter in AESTR
are identical circuits. Each filter is a four pole Butterworth
response circuit with discrete cutoff frequencies of 10, 15,
30 and 60 Hz. The Rauch type filter network is selected since
the circuit values are rapidly calculated for multiple filter
sections by using the normalized tables contained in Foster's
paper. μα Also, this network can be modified to provide a
continuously variable cutoff frequency or have positive gain by
modifying the resistive elements of the circuit. [28] In AESTR,
the filters have unity gain and the cutoff frequencies are
established by switching various capacitor values into the
network while maintaining all resistor values at a constant value
of 10K. The author decided to vary the capacitors rather than
the resistors to control cutoff frequencies because of
hardware considerations. As more data and experience are gained in
the operation of AESTR, it may be desirable to design positive
gain and continuously variable cutoff frequency into the filters
based on recommendations of the speech therapists. Each filter
is mounted on a separate circuit board and modifications can
be accomplished without changing the internal chassis wiring.

The basic building block of the Rauch filter is a single section
which has two poles in the complex frequency plane. Its schematic
and transfer function are shown in Figure 8. Two cascaded
sections are required to obtain a roll-off of 24 db per octave
for frequencies above the cutoff frequency. A 25 microfarad
coupling capacitor is inserted between sections to block D.C.
components, while a 10K shunt resistor is inserted at the input of
each section to provide a D.C. return path to the base of the
inverting input transistor enclosed in the Fairchild uA 709
operational amplifier. This resistor also develops the required
input voltage necessary for proper filter response. All filter
network resistors are fixed at 10 K ohms to provide an adequate
filter impedance match to the mixer output and to determine
practical capacitor values which can be obtained for fabrication
of the network.

Figure 8. Two Pole Rauch Low-Pass Filter and Transfer Function

From the table of normalized capacitor values for a Butterworth
filter with four poles, [14] it is a simple matter to
calculate capacitor values for various low-pass filter cutoff
frequencies. The calculated and actual values used in the
AESTR apparatus are listed in Table 6. Although the actual
capacitor component values deviated from the calculated values,
the filter response is quite satisfactory. Figure 9 is a plot
of the response curves of the low-pass filters in
AESTR.
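
The capacitor calculation can be illustrated with a short sketch. It does not reproduce Foster's normalized tables or the values of Table 6. It assumes the textbook multiple-feedback (Rauch) low-pass section with three equal resistors, for which the natural frequency and Q satisfy w0^2 = 1/(R^2 Cs Cf) and w0/Q = 3/(R Cs), where Cs is the shunt capacitor and Cf the feedback capacitor (names chosen here for illustration). The extra 10K shunt resistor at each section input is ignored, so the numbers are indicative only.

```python
import math

# Q factors of the two pole pairs of a four pole Butterworth response.
BUTTERWORTH_4_POLE_Q = (
    1.0 / (2.0 * math.cos(math.pi / 8)),      # about 0.541
    1.0 / (2.0 * math.cos(3 * math.pi / 8)),  # about 1.307
)

def rauch_equal_r_capacitors(cutoff_hz, q, r_ohms=10e3):
    """Return (shunt_cap, feedback_cap) in farads for one two pole section.

    Assumes an ideal equal-resistor multiple-feedback low-pass section;
    this is an illustrative model, not the exact circuit of Figure 8.
    """
    w0 = 2.0 * math.pi * cutoff_hz
    shunt_cap = 3.0 * q / (w0 * r_ohms)
    feedback_cap = 1.0 / (3.0 * q * w0 * r_ohms)
    return shunt_cap, feedback_cap

if __name__ == "__main__":
    for cutoff in (10, 15, 30, 60):
        for section, q in enumerate(BUTTERWORTH_4_POLE_Q, start=1):
            cs, cf = rauch_equal_r_capacitors(cutoff, q)
            print(f"{cutoff:>2} Hz, section {section}: "
                  f"Cs = {cs * 1e6:.3f} uF, Cf = {cf * 1e6:.3f} uF")
```

With the resistors fixed, every capacitor scales inversely with the cutoff frequency, which is why switching capacitor sets is enough to select among the four cutoff frequencies.
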

The various capacitors are mounted on a five pole, two gang
switch which is operated from the front panel of AESTR. The
ten inch cable wires between capacitors and circuit boards do
not contribute any noticeable adverse effect on the filter
response.

A zero output response is observed for zero beat frequency
output of the mixer stage due to the coupling capacitors of
the filter. This effect does not affect the purpose for which
AESTR is to be used since it is practically impossible for a
person to hold his vowel formants exactly on frequency with
the local oscillators. The continuous deviations of the formants
are sufficient to cause beat frequencies which will be present in
the pass bands of the filters.

The filter output is passed into an amplifier using a 1 K ohm
input resistor and a 1 Megohm potentiometer in the feedback circuit
across a uA 709 operational amplifier. The potentiometer control is
designated as "filter sensitivity" on the front panel of AESTR.
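
The range of the "filter sensitivity" control follows from the resistor values. The sketch below assumes the amplifier is an ideal inverting stage, so that the gain magnitude is the feedback resistance divided by the input resistance; the function names are illustrative.

```python
import math

def inverting_gain(feedback_ohms, input_ohms=1e3):
    """Gain magnitude of an ideal inverting operational amplifier stage."""
    return feedback_ohms / input_ohms

def gain_db(gain):
    """Express a voltage gain magnitude in db."""
    return 20.0 * math.log10(gain)

if __name__ == "__main__":
    # Sweep the 1 Megohm potentiometer from a small setting to full scale.
    for setting_ohms in (10e3, 100e3, 1e6):
        g = inverting_gain(setting_ohms)
        print(f"{setting_ohms / 1e3:>6.0f} K ohms: gain {g:.0f} ({gain_db(g):.0f} db)")
```

Under this assumption the full potentiometer setting corresponds to a gain of about 1000, although the open-loop gain and compensation of the real amplifier limit what is actually achieved.
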

As stated previously, two identical low-pass filters are contained
in the AESTR system. One filter responds to the first formant beat
frequency and the other responds to the second formant beat frequency
generated in their respective mixer circuits. Figure 10a is a
schematic of the complete filter network while Figure 10b is a schematic
of the beat frequency amplifier which drives a 0-10 volt rectifying
voltmeter. Several types of meters were considered for use as
visual indicators of the beat frequency. The meters used in AESTR
were selected simply because they were available in the stockroom
and adequately served AESTR's purpose.

In Figures 10a and 10b, the uA 709 operational amplifiers
are frequency compensated in the same manner shown in the
preamplifier schematic of Figure 6. The compensation components
have been omitted from the filter and amplifier circuits for the
sake of clarity. Also, the schematics identify terminals associated
with circuit board B. Circuit board C is identical to B with
respect to all terminal connections and component values.
Figure 10a. Four Pole Rauch Low-Pass Filter Schematic

Figure 10b. Low-Pass Filter Gain Schematic and Beat Frequency Indicator


10. HIGH-PASS FILTER DESIGN

In order to prevent the fundamental frequency of the vocal
bands from passing directly through the mixer and low-pass
filter circuits, a high-pass filter network is inserted between
the preamplifier and mixers. Its configuration is realized
by the Sallen and Key method. [35] A high-pass filter has a
normalized frequency transfer function of

$$\frac{E_{out}}{E_{in}} = \frac{s^2}{s^2 + ds + 1} \tag{4}$$

where d is the damping factor. This type of response is obtained
from the basic high-pass filter network of Figure 11.


Figure 11. Basic High-Pass Filter Network


Such a filter will give a 12 db per octave roll-off for
frequencies below the cutoff frequency. R1, R2, C1 and C2 and the
gain of the emitter follower act together to determine the cutoff
frequency and the shape of the response curve during the
transition from the stop band to the pass band. In the actual
circuit, R2 is equal to the resistance of three parallel resistors.
These are the two bias resistors and the input resistance of the
2N226 transistor.
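
The 12 db per octave figure can be verified directly from the normalized high-pass transfer function (4). The sketch below evaluates that function along the imaginary axis; the default damping factor of the square root of 2 (a two pole Butterworth response) is an assumption made for the example, and the function names are illustrative.

```python
import math

def highpass_gain_db(w, d=math.sqrt(2.0)):
    """Gain in db of s^2 / (s^2 + d*s + 1) evaluated at s = jw.

    w is the frequency normalized to the cutoff frequency and d is the
    damping factor of equation (4).
    """
    s = complex(0.0, w)
    h = (s * s) / (s * s + d * s + 1.0)
    return 20.0 * math.log10(abs(h))

def stopband_slope_db_per_octave(w, d=math.sqrt(2.0)):
    """Gain recovered when the normalized frequency doubles from w to 2w."""
    return highpass_gain_db(2.0 * w, d) - highpass_gain_db(w, d)

if __name__ == "__main__":
    print(f"gain at cutoff: {highpass_gain_db(1.0):.2f} db")
    print(f"slope well below cutoff: {stopband_slope_db_per_octave(1 / 32):.2f} db/octave")
```

An ideal cascade of two such sections would double the stop band slope; the measured behaviour of the actual cascade is reported with the AESTR schematic in Figures 12 and 13.
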

The AESTR high-pass filter schematic is shown in Figure 12.
An emitter follower drives the high-pass filter stage, which
consists of two cascaded sections to yield an expected
attenuation of 24 db per octave. [16] The individual sections do
yield a Butterworth response of 12 db per octave roll-off,
but, when cascaded together, a total roll-off of only 20 db
per octave is realized with an additional +3 db hump at the
corner frequency. The actual response shown in Figure 13 is
considered satisfactory for the pitch elimination function
in AESTR's system.

Note that the pitch eliminator has four discrete cutoff
frequencies of 75, 190, 450 and 1050 Hz. The desired cutoff
is obtained by switching in various capacitors mounted on a
five pole, two gang switch attached to AESTR's front panel.
The fifth position permits the high-pass filter to be bypassed
so that AESTR can be used to discriminate between voiced and
unvoiced consonants. This feature was incorporated into the
apparatus after Dr. Gray operated a breadboard version of the
system and suggested that a "pitch" or "no pitch" capability
be incorporated into AESTR.
11. DECISION AND RESPONSE CIRCUIT DESIGN

The beat frequency outputs of the first and second formant
filters are applied to terminals 3 and 20 of circuit board D,
whose schematic is shown in Figure 14. These waveforms are
half-wave rectified and smoothed by a passive low-pass RC
filter. The resultant D.C. voltages are impressed on the two
input gates of an AND circuit. When both F1 and F2 rectified
beat frequency voltages are simultaneously present and of
sufficient magnitude to cause +7.5 volts D.C. to appear on each
diode of the AND gate, the diodes become reverse biased, thereby
directing a 400 microampere current into the base of the 2N2924
transistor. This action drives the transistor into saturation,
permitting a collector current of 30 milliamperes to flow
through the relay coil, which acts as the load for the circuit,
and closes the relay contacts. A zener diode is inserted at
the base terminal of the transistor to prevent the transistor
from being switched on when only one diode of the AND gate is
reverse biased.
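
The decision logic described above can be summarized in a short behavioural sketch. It models only the threshold comparison and the ratio of the quoted collector and base currents; the function names are illustrative, and the diode drops and zener voltage of the actual circuit are not modelled.

```python
THRESHOLD_VOLTS = 7.5  # D.C. level at which each AND gate diode is reverse biased

def relay_closed(f1_dc_volts, f2_dc_volts, threshold=THRESHOLD_VOLTS):
    """True when both rectified beat frequency voltages reach the threshold."""
    return f1_dc_volts >= threshold and f2_dc_volts >= threshold

def minimum_current_gain(collector_amps=30e-3, base_amps=400e-6):
    """Smallest transistor current gain that lets the quoted base current
    support the quoted collector current."""
    return collector_amps / base_amps

if __name__ == "__main__":
    for f1, f2 in ((8.0, 8.0), (8.0, 3.0), (2.0, 9.0)):
        state = "closed" if relay_closed(f1, f2) else "open"
        print(f"F1 = {f1} V, F2 = {f2} V: relay {state}")
    print(f"required current gain: about {minimum_current_gain():.0f}")
```

The relay closes only when both formant channels are above threshold at the same time, which is the AND behaviour that defines a window in the F1-F2 plane.
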

The relay is a stockroom surplus item which operates on
14 volts and 25 milliamperes. It has two sets of contacts.
One set activates the green panel "correct" light and the other
set connects a 115 volt supply to the appliance socket mounted
on the rear chassis of AESTR. The Monterey Institute for Speech
and Hearing has a 115 volt relay operated device which
dispenses M&M candy disks to children when they perform desired
tasks. AESTR is able to operate this dispenser or any other
115 volt device in response to the desired articulation of
the child.
12. FABRICATION

Economy and availability of supplies dictated the construction
of AESTR. All components are housed in an aluminum case 16"
wide, 12" deep and 10" high. The control panel is inclined
20° from the vertical so that the values of the control settings
can be read with greater ease. The case was handmade in the
student metal shop. In addition, the control panel was rubbed
with emery paper until the metal acquired a satin finish.

The chassis for circuit components has four 22 terminal
sockets which accept the standard 4½" by 6" circuit boards.
Also mounted on the chassis are an 11 pin socket for the power
supply package, mounting holes for the relay, an octal
socket for power distribution cables and a 27 pin socket for
signal distribution cables which originate from the components
mounted on the rear of the control panel. Fusing is provided
for circuit protection.

The circuit boards are identified by letters as follows:

Board A    Preamplifier, high-pass filter, mixers
Board B    F1 low-pass filter, amplifier
Board C    F2 low-pass filter, amplifier
Board D    Rectifier, AND circuit, transistor switch

The functional segregation of the circuit boards permits future
changes to the circuitry by simply replacing an entire board.
It should not be necessary to change the internal wiring of
the chassis for such modifications.

The original circuit boards used for mounting components
were the etched contact plugboards, Vector #838PWE. They are
considered to be restrictive in flexibility. The Vector
#838 pre-etched boards proved to be more versatile. Components
are mounted easily and securely with the aid of metal washers
riveted on the holes through which the lead wires pass
to the other side of the board. Additional holes must be
drilled into the board to accommodate the integrated circuit
octal socket. Learning how to properly mount components so as
to conserve space, minimize leads and avoid ground loops is
considered by the author to be a very useful and important aspect
of this thesis.

The electronic circuits of AESTR require 30 milliamperes
on both the plus and minus 15 volt supply terminals. When the
relay and "correct" light are activated, the current drain
increases to 95 milliamperes on both supply terminals. The
power is supplied by a Power Mate Power Supply, Model DRA16-
.2/16-.2. Its regulated output can be set between 15 and 17
volts and is rated to provide 200 milliamperes on the plus
and minus terminals. The voltage regulation is excellent even
during sudden current level changes when the light and relay
activate. Figure 15 shows the power distribution in AESTR.

As stated previously, the filter capacitors are mounted
on five pole, two gang switches. These components are located
longitudinally around the periphery of the chassis so as to
economize on space and also obtain structural support.

Troubleshooting the system after AESTR was completely
wired consumed many hours. A component value error and a cable
error required correcting before successful operation of the
assembled machine could be achieved.

Numerous minor problems were encountered in the fabrication
of AESTR. These difficulties served to prove that the
transition from theory to a practical working apparatus
is not a trivial matter.


13. PRELIMINARY TEST RESULTS 

AESTR was initially tested during the final phase of its
design stage by Dr. Gray at the school electronics laboratory.
At that time, the F1 and F2 low-pass filters had fixed cutoff
frequencies of 50 and 100 Hz respectively. His evaluation
of the machine indicated that the pass band of the filters
had to be reduced in order to have the machine properly
discriminate between closely related voiced sounds such as
ER and E. Therefore the filters were redesigned to have a
series of discrete cutoff frequencies of 10, 15, 30 and 60 Hz.

During this initial evaluation, it was also learned that
air streams impinging on the microphone cause a transient
response in AESTR of sufficient magnitude to activate the relay
circuit. To avoid this type of false response, the speaker
should hold the microphone in a vertical position approximately
four inches away from and slightly below his lips. In the case
of a child, a headset type configuration similar
to the kind commonly worn by telephone operators would keep the
microphone properly positioned relative to the mouth of the
speaker.

After its fabrication, AESTR was tested by the author.
The machine control settings obtained for an adult male voice
and female voice articulation of the vowel sounds are listed
in Table 7. These settings represent the best values which
could be obtained for the smallest spectral window in the
F1-F2 plane. In all cases, the first formant of the vowel was
readily located with minimum sweeping of the F1 local oscillator.
The second formant was more difficult to locate for
vowel sounds IY, I, and ER. The F2 local oscillator must be
swept across its frequency range three or four times before
the operator is certain that the F2 frequency has been located.
This is to be expected since the amplitude of the second
formant is lower than that of the first formant for all vowel
sounds.

Pitch measurements were made according to the procedures 
stated in section 5. Pitch frequencies are rapidly determined 
and do show a variation with the vowel sounds as indicated 
in Table 2. 

TABLE 8 

AESTR PITCH MEASUREMENTS OF AN ADULT MALE VOICE FOR VOWEL SOUNDS 
AESTR is now on loan to the Monterey Institute for Speech
and Hearing for field testing. Their preliminary operation
of the apparatus in conjunction with an M&M candy dispenser
revealed a new problem. Candy disks were being dispensed at
a very rapid rate since the relay opened and closed every time
the voice quivered in and out of the desired sound spectral
window. Therefore, to make AESTR provide only one reward item
with each sustained sound, the AND circuit was modified to
have a 250 millisecond delay before closing the relay contacts,
and once closed, the relay would not open for two seconds.
This modification consisted of choosing the correct shunt
capacitor values in the half-wave rectifier portion of the decision
and response circuit. A nominal value of 100 microfarads
working with the resistive elements of the circuit develops
build-up and decay time constants to meet the operating
specification for the relay.
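
The relation between the shunt capacitor and the two timing requirements can be sketched with the first-order RC charge and discharge equations. The resistances of the actual circuit are not stated in the text, so the sketch solves for them under stated assumptions: the capacitor charges from 0 volts toward a hypothetical 10 volt rectified peak, the relay switches at the 7.5 volt AND gate level, and separate charge and discharge paths exist. All names and the 10 volt figure are illustrative.

```python
import math

def resistance_for_delay(delay_s, cap_f, v_final, v_threshold):
    """Charging resistance so a capacitor starting at 0 V reaches
    v_threshold after delay_s while charging toward v_final."""
    return delay_s / (cap_f * math.log(v_final / (v_final - v_threshold)))

def resistance_for_hold(hold_s, cap_f, v_start, v_threshold):
    """Discharge resistance so a capacitor starting at v_start decays
    to v_threshold after hold_s."""
    return hold_s / (cap_f * math.log(v_start / v_threshold))

CAP_F = 100e-6      # nominal shunt capacitor value from the text
V_THRESHOLD = 7.5   # AND gate switching level from the text
V_PEAK = 10.0       # hypothetical rectified peak (full scale of the 0-10 V meter)

r_charge = resistance_for_delay(0.250, CAP_F, V_PEAK, V_THRESHOLD)
r_discharge = resistance_for_hold(2.0, CAP_F, V_PEAK, V_THRESHOLD)

if __name__ == "__main__":
    print(f"charge path: about {r_charge:.0f} ohms")
    print(f"discharge path: about {r_discharge:.0f} ohms")
```

The point of the sketch is that a fast build-up and a slow decay require very different effective resistances for the same capacitor, which is consistent with the text's statement that the capacitor works with the resistive elements of the circuit to set both time constants.
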

Dr. Gray and his associates tested AESTR for its ability
to discriminate the individual vowel sounds. The preliminary
results indicate that the machine, for certain vowels, will
give a positive response not only to the targeted vowel but
also to certain other vowel sounds. For example, AESTR can
be set to respond to OW and it will perform properly such that
the speaker is unable to cause a positive machine response
with any vowel sound other than OW. However, if AESTR is
targeted for the central vowel sound ER, the machine will
respond to ER plus the phonemes A, OW, U, OO and UH. The
apparent cause of this undesirable multi-sound response is
that ER has a relatively low intensity level
for its first and second formants when compared to back vowels
such as OW. Unfortunately, the therapist has a greater
need to teach ER rather than OW to speech handicapped
children. To improve AESTR's ability to respond strictly to
the ER sound, Dr. Gray and the author varied the "pitch"
control settings. The attempt indicated that some improvement
could be made if the "pitch" control is set to position "D".
Now the machine will respond only to ER and OW. The OW vowel
continues to mask all other vowels since it contains the
greatest amount of energy throughout the audio-frequency
spectrum.

A different approach was tried to overcome the ER ambiguity
response of AESTR. Both the F1 and F2 local oscillators
were set to the second formant frequency of 1480 Hz while
the "pitch" control remained in position "D". The volume
control was set to a value of 3 and the sensitivity controls were
set to a value of 4. In this state, the machine would respond
only to the ER sound for a majority of trials. This can be
explained by noting that OW has both its F1 and F2 frequencies
below 1 KHz, which are attenuated by the high-pass filter, and
the harmonic components of OW near 1480 Hz are insufficient
to cause a positive response of the machine. Now, when a speaker
makes the ER sound, its second formant (near 1480 Hz) is not
attenuated by the high-pass filter and will provide a strong
beat frequency out of both F1 and F2 filters, thus causing
AESTR to give a positive response. This type of machine
operating procedure will be investigated further and extended to
take advantage of the third formant information associated
with each vowel.
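
The operating procedure just described can be expressed as an idealized frequency-window test. The sketch below treats each channel as responding when the beat frequency between a speech component and the local oscillator is nonzero and within the low-pass cutoff. It ignores component amplitudes, the high-pass filter and the sensitivity controls, and the speech component frequencies in the example are hypothetical, chosen only to be consistent with the text.

```python
def beat_frequency(component_hz, oscillator_hz):
    """Difference frequency produced when a speech component is mixed
    with a local oscillator."""
    return abs(component_hz - oscillator_hz)

def channel_responds(component_hz, oscillator_hz, cutoff_hz):
    """True when the beat falls inside the low-pass pass band.

    A zero beat gives no output because of the coupling capacitors
    in the filter.
    """
    beat = beat_frequency(component_hz, oscillator_hz)
    return 0.0 < beat <= cutoff_hz

def positive_response(components_hz, f1_oscillator_hz, f2_oscillator_hz, cutoff_hz):
    """True when some component excites the F1 channel and some component
    excites the F2 channel, as required by the AND circuit."""
    f1_hit = any(channel_responds(c, f1_oscillator_hz, cutoff_hz) for c in components_hz)
    f2_hit = any(channel_responds(c, f2_oscillator_hz, cutoff_hz) for c in components_hz)
    return f1_hit and f2_hit

if __name__ == "__main__":
    # Both oscillators set to 1480 Hz, 30 Hz cutoff selected.
    er_like = [1490.0]        # hypothetical component near the ER second formant
    ow_like = [450.0, 900.0]  # hypothetical components below 1 KHz
    print("ER-like input:", positive_response(er_like, 1480.0, 1480.0, 30.0))
    print("OW-like input:", positive_response(ow_like, 1480.0, 1480.0, 30.0))
```

In this idealized model a single strong component near 1480 Hz satisfies both channels at once, while components below 1 KHz produce beats far outside the pass band.
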

A speaker is able to cause AESTR to give a positive response
when he greatly increases the intensity of his voice. The
author recommends that some type of distortionless speech
compressor be inserted between the microphone and preamplifier.
Commercial devices are readily available to control the
microphone peak loudness yield.

A human limitation prevents AESTR from being operated
for more than 15 minutes by one speaker. After a person has
been producing voiced sounds for this period of time, he will
start to become hyperventilated and experience dizziness. The
effect is analogous to a person blowing up a large balloon.
Dr. Gray is giving consideration to this factor and will
develop a clinical testing procedure to avoid hyperventilation
of the speaker.


14. CONCLUSIONS

The prototype apparatus does perform electrically in the
manner it was designed to operate, but this does not imply that
AESTR is performing in a totally satisfactory manner from the
viewpoint of the speech therapist. AESTR is considered to be
approximately 50% successful in meeting the needs of the
therapist. With more operating data obtained from the machine
in future months, it is hoped that additional design criteria
can be established to improve AESTR's performance.

In addition to aiding speech handicapped children, AESTR
has potential applications to aid persons trying to learn
foreign vowel sounds. Also, this apparatus can be used in an
auxiliary manner to measure tones of musical instruments such
as pianos or organs with a high degree of accuracy.

Speech processing, and especially specific analysis of
spectral components of voiced sounds, is a challenging task
from an engineering viewpoint. This fact became very apparent
from what appeared to be a very straightforward thesis subject.
APPENDIX I


Selected Glossary of Speech Terms 
ARTICULATE. To produce a speech sound by the organs of speech. 


ARTICULATION. The set of human bodily positions and movements 
aiming at the production of speech sounds. 


BACK. A vowel articulated by raising the back part of the 
tongue towards the velum, e.g. sort. 


CENTRAL. A vowel articulated by raising the central part of 
the tongue towards the juncture of the palate and the 
velum, e.g. first. 


CONSONANT. A speech sound articulated by a complete closure 
of the air passage or by a narrowing of it beyond the 
vowel limit, e.g. go, or see. 


DIPHTHONG. A vowel articulated by a deliberate movement of 
the speech organs from one position into the other. 


FRICATIVE. A consonant articulated by a narrowing of the air
passage resulting in audible friction, e.g. shame.


FRONT. A vowel articulated by raising the front part of the 
tongue towards the palate, e.g. get. 


FULLY VOICED. A speech sound articulated by the vocal cords 
vibrating during the whole of its articulation, e.g. 
living or put. 


ORGANS OF SPEECH. Those parts of the human body which are
active in the production of speech sounds, i.e. the lungs,
the trachea (windpipe), the vocal cords, the glottis,
the pharynx, the nose, the lips, the teeth, the alveoli
(teeth ridge), the palate (hard palate), the velum (soft
palate), the uvula, the tongue. The tongue is arbitrarily
divided into four parts: the tip, the blade, the
center and the back.


PHONEME. A class of distinctive speech sounds, the members
of which are (1) in complementary distribution with
each other, and (2) in opposition or contrast to any
other class of distinctive speech sounds. Thus, /d/ in
read and /d/ in middle are members of the same phoneme,
whereas /d/ in date and /l/ in late are members of two
different phonemes.


PHONEMICS. The scientific study of distinctive speech sounds. 


PHONETICS. The scientific study of speech sounds. 


PLOSIVE. A consonant articulated by a complete closure of the air 
passage, combined with air-compression behind the closure, and 
followed by an explosion in the release stage, e.g. kind. 


SPEECH. A sequence of sounds articulated for the purpose of human
communication.


SYLLABLE. A structural unit capable of being connected as a whole
with one particular degree of accent, e.g. become.


VELUM. The soft palate of the oral cavity. 


VOICED. A speech sound, consonant or vowel, articulated with the
vocal cords vibrating during the whole of its articulation, or
part of it, e.g. weather, park, one.


VOICELESS. A speech sound, especially a consonant, articulated with 
no voicing, e.g. lucky. 


VOWEL. A speech sound articulated with no closure of the air-passage 
and no narrowing of it beyond the vowel limit, e.g. bad or most. 


WORD. A structural unit separated in writing by spaces, e.g. bed
(one word), room (one word), bedroom (one word), textbook
(one word), a good subject (three words).
APPENDIX II
PHOTOGRAPHS OF PROTOTYPE EQUIPMENT 